Many artificial intelligence applications require a huge amount of computing resources. As a result, cloud computing adoption is increasing in the artificial intelligence field. To meet the demand of artificial intelligence applications and guarantee service level agreements, cloud computing should provide not only computing resources but also fundamental mechanisms for efficient computing. In this regard, snapshot protocols have been used to create a consistent snapshot of the global state in cloud computing environments. However, existing snapshot protocols are not optimized for artificial intelligence applications, where large-scale iterative computation is the norm. In this paper, we present a distributed snapshot protocol for efficient artificial intelligence computation in cloud computing environments. The proposed protocol is based on a distributed algorithm that runs across multiple interconnected nodes in a scalable fashion. It can handle artificial intelligence applications in which a large number of computing nodes are running. We show that our distributed snapshot protocol guarantees the correctness, safety, and liveness conditions.